792 research outputs found
Scalable Audience Reach Estimation in Real-time Online Advertising
Online advertising has been introduced as one of the most efficient methods
of advertising throughout the recent years. Yet, advertisers are concerned
about the efficiency of their online advertising campaigns and consequently,
would like to restrict their ad impressions to certain websites and/or certain
groups of audience. These restrictions, known as targeting criteria, limit the
reachability for better performance. This trade-off between reachability and
performance illustrates a need for a forecasting system that can quickly
predict/estimate (with good accuracy) this trade-off. Designing such a system
is challenging due to (a) the huge amount of data to process, and, (b) the need
for fast and accurate estimates. In this paper, we propose a distributed fault
tolerant system that can generate such estimates fast with good accuracy. The
main idea is to keep a small representative sample in memory across multiple
machines and formulate the forecasting problem as queries against the sample.
The key challenge is to find the best strata across the past data, perform
multivariate stratified sampling while ensuring fuzzy fall-back to cover the
small minorities. Our results show a significant improvement over the uniform
and simple stratified sampling strategies which are currently widely used in
the industry
High-dimensional Sparse Inverse Covariance Estimation using Greedy Methods
In this paper we consider the task of estimating the non-zero pattern of the
sparse inverse covariance matrix of a zero-mean Gaussian random vector from a
set of iid samples. Note that this is also equivalent to recovering the
underlying graph structure of a sparse Gaussian Markov Random Field (GMRF). We
present two novel greedy approaches to solving this problem. The first
estimates the non-zero covariates of the overall inverse covariance matrix
using a series of global forward and backward greedy steps. The second
estimates the neighborhood of each node in the graph separately, again using
greedy forward and backward steps, and combines the intermediate neighborhoods
to form an overall estimate. The principal contribution of this paper is a
rigorous analysis of the sparsistency, or consistency in recovering the
sparsity pattern of the inverse covariance matrix. Surprisingly, we show that
both the local and global greedy methods learn the full structure of the model
with high probability given just samples, which is a
\emph{significant} improvement over state of the art -regularized
Gaussian MLE (Graphical Lasso) that requires samples. Moreover,
the restricted eigenvalue and smoothness conditions imposed by our greedy
methods are much weaker than the strong irrepresentable conditions required by
the -regularization based methods. We corroborate our results with
extensive simulations and examples, comparing our local and global greedy
methods to the -regularized Gaussian MLE as well as the Neighborhood
Greedy method to that of nodewise -regularized linear regression
(Neighborhood Lasso).Comment: Accepted to AI STAT 2012 for Oral Presentatio
- β¦